Video Streaming at Scale π₯
Core Conceptβ
Key Insight: Streaming platforms don't fetch entire videos firstβthey use chunked delivery and adaptive streaming for immediate playback.
1. Streaming vs Downloadingβ
| Method | Approach | User Experience |
|---|---|---|
| Downloading | Fetch entire file first | Wait minutes for GB files |
| Streaming | Fetch small chunks progressively | Playback starts in seconds |
Why streaming wins: Videos can be 1-10GB+. Nobody waits 5 minutes to start watching.
2. Video Processing Pipelineβ
Step 1: Encoding & Segmentationβ
# Original video processing
original_video.mp4
βββ 144p/ (low bitrate)
βββ 360p/ (medium bitrate)
βββ 720p/ (high bitrate)
βββ 1080p/ (HD bitrate)
βββ 4K/ (ultra-high bitrate)
# Each resolution split into chunks
720p/
βββ chunk_001.ts (2-10 seconds)
βββ chunk_002.ts (2-10 seconds)
βββ chunk_003.ts (2-10 seconds)
βββ ...
Step 2: Manifest Creationβ
HLS Example (.m3u8 playlist):
#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:10
#EXTINF:10.0,
chunk_001.ts
#EXTINF:10.0,
chunk_002.ts
#EXTINF:10.0,
chunk_003.ts
3. Adaptive Bitrate Streaming (ABR)β
How It Worksβ
- Monitor network speed continuously
- Switch quality based on bandwidth
- Seamless transitions between resolutions
ABR Decision Logicβ
if (bandwidth > 5Mbps) {
quality = "1080p"
} else if (bandwidth > 2Mbps) {
quality = "720p"
} else if (bandwidth > 1Mbps) {
quality = "480p"
} else {
quality = "240p"
}
Real-World Exampleβ
User starts video:
chunk1_360p.ts β plays immediately
chunk2_720p.ts β bandwidth good, upgrade
chunk3_720p.ts β continues HD
chunk4_480p.ts β network congestion, downgrade
4. Streaming Protocolsβ
HLS (HTTP Live Streaming)β
- Created by: Apple
- Format:
.m3u8playlists +.tssegments - Support: iOS, Safari, most platforms
- Latency: 6-30 seconds
DASH (Dynamic Adaptive Streaming)β
- Created by: ISO standard
- Format:
.mpdmanifests + various segments - Support: Wide browser support
- Latency: 2-30 seconds
WebRTC (Real-time)β
- Use case: Live streaming, video calls
- Latency: Sub-second
- Trade-off: Higher complexity, less scalable
5. CDN Architectureβ
Origin Server (1x)
βββ Video processing
βββ Master storage
βββ Initial upload
Edge Servers (100s-1000s)
βββ Geographic distribution
βββ Cached video chunks
βββ Low-latency delivery
User Request Flow:
User β Nearest Edge Server β Chunk Delivery
CDN Benefitsβ
- Reduced latency: Serve from nearby servers
- Load distribution: Spread traffic across nodes
- Redundancy: Multiple copies prevent outages
- Bandwidth efficiency: Cache popular content locally
6. Client-Side Streamingβ
Buffer Managementβ
// Typical streaming player logic
const BUFFER_SIZE = 30; // seconds ahead
const MIN_BUFFER = 5; // minimum before stalling
function manageBuffer() {
if (currentBuffer < MIN_BUFFER) {
fetchNextChunks(3); // emergency fetch
} else if (currentBuffer < BUFFER_SIZE) {
fetchNextChunks(1); // normal operation
}
// else: buffer full, pause fetching
}
Quality Switchingβ
- Upward switching: Gradual (avoid wasted bandwidth)
- Downward switching: Immediate (prevent stalling)
- Chunk boundaries: Quality changes only between segments
7. Advanced Optimizationsβ
Byte-Range Requestsβ
GET /video.mp4 HTTP/1.1
Range: bytes=2048000-4095999
- Fetch specific byte ranges from single file
- Useful for seeking to specific timestamps
- Alternative to chunk-based streaming
Preloading Strategiesβ
- Predictive buffering: Fetch likely next chunks
- Quality pre-loading: Cache multiple resolutions
- Seek optimization: Pre-load keyframes for scrubbing
Modern Codecsβ
| Codec | Compression | Quality | CPU Load |
|---|---|---|---|
| H.264 | Standard | Good | Low |
| H.265/HEVC | 50% better | Excellent | Medium |
| AV1 | 30% better than H.265 | Best | High |
8. Platform Examplesβ
YouTubeβ
- Protocol: DASH primarily
- Resolutions: 144p β 8K
- Strategy: Start low (360p), upgrade quickly
- Innovation: VP9/AV1 codecs for efficiency
Netflixβ
- Protocol: Custom DASH implementation
- Pre-encoding: 15+ quality levels per title
- CDN: Custom Open Connect network
- Optimization: Per-title encoding optimization
Twitch (Live)β
- Protocol: HLS for playback, RTMP for ingest
- Latency: 3-5 second delay
- Transcoding: Real-time quality variants
- Challenge: Live = can't pre-process chunks
9. Performance Metricsβ
User Experience Metricsβ
- Startup time: Time to first frame (
<2starget) - Rebuffering ratio: % of playback time stalled (
<1%) - Quality switches: Frequency of resolution changes
- Bandwidth efficiency: Data used vs video length
Technical Metricsβ
- Chunk fetch time: Network request latency
- Decode performance: Client-side processing speed
- Cache hit ratio: CDN efficiency (>90% target)
- Origin load: Traffic reaching source servers
10. Implementation Considerationsβ
Client Implementationβ
// Modern HTML5 video with HLS.js
import Hls from 'hls.js';
const video = document.querySelector('video');
const hls = new Hls({
maxBufferLength: 30, // 30s buffer
maxMaxBufferLength: 60, // absolute max
startLevel: -1, // auto quality
capLevelToPlayerSize: true, // match screen resolution
});
hls.loadSource('playlist.m3u8');
hls.attachMedia(video);
Server-Side Considerationsβ
- Concurrent connections: Handle thousands of simultaneous streams
- Geographic distribution: CDN placement strategy
- Storage costs: Balance quality vs storage requirements
- Processing power: Real-time transcoding for live content
Key Takeawaysβ
β Never download entire videos β chunk-based delivery is essential β Adaptive quality β match network conditions dynamically β CDN distribution β serve from edge servers for low latency β Multiple protocols β HLS/DASH for different use cases β Buffer management β balance startup time vs smooth playback β Modern codecs β H.265/AV1 for better compression β Monitoring β track performance metrics for optimization
Bottom line: Streaming at scale requires sophisticated infrastructure, but the user just sees "instant" video playback.